Exponential random graphs as models of overlay networks

M. Draief, A. Ganesh and L. Massoulié



Abstract 



In this paper, we give an analytic solution for graphs with $n$ nodes and $E$ edges for which the
probability of obtaining a given graph $G$ is $\mu(G) = e^{-\beta \sum_{i=1}^{n} d_i^2}$, where $d_i$ is the degree of node $i$. We
describe how this model naturally appears in the context of load balancing in communication networks,
namely Peer-to-Peer overlays. We then analyse the degree distribution of such graphs and show that
the degrees are concentrated around their mean value. Finally, we derive asymptotic results on the
number of edges crossing a graph cut and use these results (i) to compute the graph expansion and
conductance, and (ii) to analyse the graph resilience to random failures.



AMS classification: 60K35,60F15,68R10,90B18,05C07,05C80,05C85,05C90 

Keywords: Exponential random graphs, Peer-to-Peer networks, overlay optimisation, load balancing, 
degree distribution, graph cut, expansion, conductance, failure resilience. 



1 Introduction 

Random graphs provide a way of modelling large and complex networks, and of studying stochastic
processes on such networks. Early work on this topic goes back to the famous random graph or Bernoulli
graph introduced by Solomonoff and Rapoport [23] in the early 1950s and studied by Erdős and Rényi [7] a
decade later. The Bernoulli random graph model is, however, rather simplistic and fails to capture
important features of many real-world networks. This has stimulated work on a number of other random graph
models. Exponential random graphs were first introduced in the early 1980s by Holland and Leinhardt
[14] based on the work of Besag [2]. More recently, Frank and Strauss [8] studied a subclass of these graphs,
namely Markov graphs. They correspond to log-linear statistical models of random graphs with general
dependence structure and Markov dependence [4], widely used by statisticians and social network analysts.



To motivate the study of such graphs, we consider the situation where we have measurements of a
number of network properties, or observables, for a real-world network, and wish to come up with a network
model that exhibits similar properties. Denote these observables by $(x_i)_{i=1,\dots,k}$ and denote by
$(\bar{x}_i)_{i=1,\dots,k}$ their measured average values. Let $\mathcal{G}$ be a set of graphs, and let $G$ be a graph in
$\mathcal{G}$. To describe a family of graphs that reproduce the observed properties of the network, we wish to
choose a probability distribution $\mu$ on $\mathcal{G}$ such that

$$\sum_{G \in \mathcal{G}} \mu(G)\, x_i(G) = \bar{x}_i, \qquad \forall i = 1,\dots,k, \qquad (1)$$

where $x_i(G)$ is the value taken by $x_i$ in the graph $G$. Clearly, there are infinitely many such probability
distributions; a popular choice is the one that maximises the Gibbs or Shannon entropy

$$S = -\sum_{G \in \mathcal{G}} \mu(G) \log \mu(G)$$

subject to (1) and the normalising condition $\sum_{G \in \mathcal{G}} \mu(G) = 1$. Introducing Lagrange multipliers one can
easily show [20] that the maximum entropy is achieved for the distribution

$$\mu(G) = \frac{1}{Z}\, e^{-H(G)}, \qquad H(G) = \sum_{i=1}^{k} \theta_i x_i(G), \qquad (2)$$

where the $\theta_i$ are the Lagrange multipliers and $Z = \sum_{G \in \mathcal{G}} e^{-H(G)}$ is the normalising constant. Graphs
drawn according to distributions defined by (2) are called exponential random graphs. Thus, they are
random graphs with maximum entropy subject to the specified constraints.
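The Lagrange multiplier step can be spelled out as follows. This is only a sketch of the standard argument: the multiplier $\nu$ for the normalising condition is a symbol introduced here for illustration, and $\theta_i$ denotes the multipliers attached to the constraints (1).

```latex
% Lagrangian for maximising the entropy S subject to (1) and normalisation
\mathcal{L}(\mu) = -\sum_{G} \mu(G)\log\mu(G)
   - \sum_{i=1}^{k} \theta_i \Big( \sum_{G} \mu(G)\, x_i(G) - \bar{x}_i \Big)
   - \nu \Big( \sum_{G} \mu(G) - 1 \Big).

% Stationarity with respect to each \mu(G)
\frac{\partial \mathcal{L}}{\partial \mu(G)}
   = -\log\mu(G) - 1 - \sum_{i=1}^{k} \theta_i x_i(G) - \nu = 0
\quad\Longrightarrow\quad
\mu(G) = e^{-(1+\nu)}\, e^{-\sum_{i=1}^{k} \theta_i x_i(G)}.

% Normalisation fixes e^{1+\nu} = Z = \sum_{G} e^{-H(G)}, which gives (2).
```

Since the entropy is strictly concave in $\mu$ and the constraints are linear, this stationary point is the unique maximiser whenever the constraints are feasible.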

Exponential random graphs can be generated using suitable random walks on the space of graphs, for
which they arise as the stationary distribution. More precisely, given $H(G)$, a cost or energy function
associated with the graph $G$, define the Markov chain on the set of graphs $\mathcal{G}$ with transition probabilities

$$P_{G,G'} = \min\left(1, e^{-(H(G') - H(G))}\right).$$

It can easily be shown that the transition matrix fulfils the detailed balance condition (the Markov chain
is reversible) and the corresponding stationary distribution is given by the Boltzmann type probability
distribution $\mu(G) = Z^{-1} e^{-H(G)}$.
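The chain above can be sketched in a few lines of Python. The sketch below makes several assumptions that are not taken from the paper: the energy is $H(G) = \beta \sum_i d_i^2$, the state space is the set of simple graphs with a fixed number of edges, and the proposal moves a uniformly chosen edge to a uniformly chosen pair of nodes (a symmetric proposal, which is what makes the acceptance rule $\min(1, e^{-(H(G')-H(G))})$ sufficient for detailed balance). The function names are hypothetical.

```python
import math
import random


def sum_sq(degrees):
    """Sum of squared degrees, so that H(G) = beta * sum_sq(degrees)."""
    return sum(d * d for d in degrees)


def metropolis_chain(n, edges, beta, steps, rng):
    """Run `steps` Metropolis moves on simple graphs with n nodes and a fixed edge count.

    Proposal (an assumption of this sketch): pick a uniformly random existing
    edge and a uniformly random pair of distinct nodes; if that pair is not
    already an edge, propose moving the edge there. The proposal is symmetric.
    """
    edges = [tuple(sorted(e)) for e in edges]
    present = set(edges)
    degrees = [0] * n
    for u, v in edges:
        degrees[u] += 1
        degrees[v] += 1
    h = beta * sum_sq(degrees)
    for _ in range(steps):
        idx = rng.randrange(len(edges))
        a, b = edges[idx]
        u, v = sorted(rng.sample(range(n), 2))
        if (u, v) in present:
            continue  # the move would create a multi-edge: stay in place
        new_degrees = degrees[:]
        new_degrees[a] -= 1
        new_degrees[b] -= 1
        new_degrees[u] += 1
        new_degrees[v] += 1
        h_new = beta * sum_sq(new_degrees)
        # Accept with probability min(1, exp(-(H(G') - H(G)))).
        if h_new <= h or rng.random() < math.exp(-(h_new - h)):
            present.remove((a, b))
            present.add((u, v))
            edges[idx] = (u, v)
            degrees, h = new_degrees, h_new
    return edges, degrees
```

Starting from a very unbalanced graph such as a star, a moderate value of `beta` tends to push the chain towards graphs with more even degrees, which is the load-balancing effect the model is meant to capture.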

In this paper, we study the particular case of graphs with $n$ nodes and $E$ edges for which
$H(G) = \beta \sum_{i=1}^{n} d_i^2$, where $d_i$ is the degree of node $i$. This model naturally appears in the context of load balancing in
certain communication networks, namely peer-to-peer overlays. Such overlays are used to support many
popular file-sharing applications on the Internet. A primary objective in designing such overlays is to
ensure connectivity of the resulting graph even in the face of node and edge disconnections. We can model
an overlay as a graph with $n$ nodes representing the peers, connected by edges describing whether two
peers know each other or not. We assume that the "who knows who" relationship is symmetric, i.e., the
graph is undirected. In [9], an algorithm is described that ensures the construction of an Erdős-Rényi-like
overlay, wherein any pair of peers is connected with a given probability independently of other pairs. It
is known that such graphs are connected if the mean degree of nodes is of order higher than $\log n$ [5], and
the result is true for more general graphs [1]. In [10], the exponential random graph model with energy
function $H(G) = \beta \sum_{i=1}^{n} d_i^2$ was proposed as a mechanism for achieving better load balancing and greater
resilience to random link failures.

The outline of the rest of the paper is as follows. We analyse the degree distribution of such graphs
in section 2 and show that the degrees are concentrated around their mean value with high probability
(whp). In section 3, we derive asymptotic results on the number of edges crossing a graph cut and use
these results (i) to compute the graph expansion and conductance in paragraph 3.1 and (ii) to analyse
the graph resilience to random failures in paragraph 3.2.



2 Degree distribution 

We work with labelled graphs throughout. We consider the following random graph model on $n$ nodes
with $E$ edges:

$$\mu_n(G) = \frac{1}{Z} \exp\left(-\beta \sum_{i=1}^{n} d_i^2\right) \mathbf{1}\left\{\sum_{i=1}^{n} d_i = 2E\right\}, \qquad (3)$$

where $d_i$ denotes the degree of node $i$ in the graph $G$, $\beta$ is a specified parameter, and $Z$ is a normalizing
constant.

Our aim in this section is to show that graphs generated according to (3), with $2E = cn\log n$, have a
sequence of degrees that are concentrated around their mean value.

The probability measure $\mu_n$ on graphs induces a probability measure on degree distributions, which
we denote by $\pi_n$. For $d = (d_1, \dots, d_n)$,

$$\pi_n(d) = \frac{1}{Z_n}\, G_n(d)\, e^{-\beta \sum_{i=1}^{n} d_i^2}\, \mathbf{1}\left\{\sum_{i=1}^{n} d_i = 2E\right\}, \qquad (4)$$

where $G_n(d)$ is the number of graphs having the degree sequence $d$, and $Z_n$ is a normalizing constant.
We can rewrite the above as

$$\pi_n(d) = \frac{1}{Z_n(\gamma)}\, G_n(d) \prod_{i=1}^{n} \frac{1}{d_i!}\, e^{-\beta d_i^2 + \gamma (\log n) d_i}, \qquad (5)$$

where $G_n(d)$ now stands for the probability that the configuration model described below yields a simple graph.
The introduction of the tilt parameter $\gamma$ does not change the distribution, as it multiplies $\pi_n(d)$ by $e^{2\gamma E \log n}$.
This is a constant since the total number of edges is fixed. Thus, it can be absorbed into the normalization
factor $Z_n(\gamma)$ along with the term $E!\,2^E/(2E)!$.

This paper expands on an earlier short version which appeared in the proceedings of the 41st Allerton Conference on
Communication, Control and Computing [11].

To construct a graph with a given degree distribution, we use the standard configuration model [3]: to
each node $i$ we associate $d_i$ labelled half-edges, also called configuration points or stubs. All stubs need
to be matched to construct the graph; this is done by connecting them uniformly at random. When a stub of $i$ is
matched with a stub of $j$, we interpret this as an edge between $i$ and $j$. The graph obtained following
this procedure may not be simple, i.e., it may contain self-loops due to the matching of two stubs of $i$, and
multi-edges due to the existence of more than one matching between two given nodes.

To restrict ourselves to the family of simple graphs we define the erased configuration model. Starting
from the multigraph obtained through the configuration model, we merge all multiple edges into a single
edge and erase all self-loops. It is shown in [24] that, provided the maximum degree of the graph $d_{\max}$
is such that $d_{\max} = o(\sqrt{n})$, the configuration model and the erased configuration model are asymptotically
equivalent, in probability, and every simple graph thus obtained corresponds exactly to $\prod_{i=1}^{n} d_i!$ distinct
configurations describing the number of ways stubs are assigned. We will show in Theorem 1 that the
above condition is indeed satisfied.
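The two constructions just described can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the uniform matching of stubs is obtained by shuffling the list of stubs and pairing consecutive entries.

```python
import random


def configuration_model(degrees, rng):
    """Return a uniform random matching of stubs as a list of node pairs.

    Node i contributes degrees[i] stubs. The result is a multigraph: it may
    contain self-loops (u == v) and repeated pairs.
    """
    stubs = [node for node, d in enumerate(degrees) for _ in range(d)]
    if len(stubs) % 2 != 0:
        raise ValueError("the sum of the degrees must be even")
    rng.shuffle(stubs)
    return [(stubs[k], stubs[k + 1]) for k in range(0, len(stubs), 2)]


def erase(pairs):
    """Erased configuration model: drop self-loops and merge multiple edges."""
    return {(min(u, v), max(u, v)) for u, v in pairs if u != v}
```

Note that `erase` can only lower degrees, so the erased graph has the prescribed degree sequence exactly when the sampled configuration was already simple.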

We denote the minimum and maximum degrees by $d_{\min}$ and $d_{\max}$ respectively. The parameter $G_n(d)$
introduced above corresponds to the probability of obtaining a simple graph in the configuration model.
This implies the upper bound $G_n(d) \le 1$ for any degree sequence $d$. Moreover, if $d_{\max} = o(E^{1/4})$, then
McKay and Wormald [17] establish the equivalence, for $n$ large,

$$G_n(d) \sim e^{-\lambda - \lambda^2}, \qquad \text{where } \lambda = \frac{1}{4E} \sum_{i=1}^{n} d_i (d_i - 1). \qquad (6)$$

Given a degree sequence $d$, we define the mean degree $\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i$ and the variance
$\mathrm{Var}(d) = \frac{1}{n} \sum_{i=1}^{n} (d_i - \bar{d})^2$. We are interested in a regime where $\bar{d} = c\log n$ for some specified constant $c$, so that
$E = cn \log n/2$.
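As a quick numerical companion to (6), the following sketch computes $\lambda$ and the estimate $e^{-\lambda-\lambda^2}$ for a given degree sequence. The function names are hypothetical, and the estimate is only meaningful in the asymptotic regime stated above.

```python
import math


def mckay_wormald_lambda(degrees):
    """lambda = sum_i d_i (d_i - 1) / (4E), with 2E = sum_i d_i."""
    two_e = sum(degrees)
    return sum(d * (d - 1) for d in degrees) / (2.0 * two_e)


def simple_graph_probability_estimate(degrees):
    """Asymptotic estimate exp(-lambda - lambda^2) from (6)."""
    lam = mckay_wormald_lambda(degrees)
    return math.exp(-lam - lam * lam)
```

For a $d$-regular sequence this gives $\lambda = (d-1)/2$, so the estimated probability of obtaining a simple graph decays quickly as the common degree grows.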

For fixed constants $\alpha_1$ and $\alpha_2$, we define the following sets of degree sequences:

$$A = \{d : \bar{d} = c\log n\},$$
$$A_1(\alpha_1, \alpha_2) = \left\{d : -\sqrt{\alpha_1 \log n} < d_i - \bar{d} < \sqrt{\alpha_2 \log n}, \ \forall i = 1, \dots, n\right\}.$$

Note that, in the regime $\bar{d} = c\log n$, $\pi_n$ is supported on $A$, and so $\pi_n(B) = \pi_n(A \cap B)$ for any set $B$ of
degree sequences. Define $\tilde{A}_1(\alpha_1, \alpha_2) = A \cap A_1(\alpha_1, \alpha_2)$. We wish to show the following.

Theorem 1. There exist constants $\alpha_1, \alpha_2$ such that $\pi_n(\tilde{A}_1(\alpha_1, \alpha_2))$ goes to 1 as $n$ goes to infinity.

The above theorem states that for the random graph model defined by the distribution (3), the node
degrees concentrate about their mean value. Specifically, all node degrees are within order $\sqrt{\log n}$ of the
mean, whp. This is in contrast to the Erdős-Rényi model (with the same number of edges) where the
maximum fluctuation of node degrees is typically of order $\log n$. The rest of the section is devoted to the
proof of this theorem. To this end, we start by proving the following.

Theorem 2. Define the event A2 = {A : di < v}!^ , Vi = 1, . . . , n}. Then 

$$\pi_n(A_2^c) \to 0, \quad \text{as } n \to \infty,$$

and the estimate in (6) holds for all $d \in A_2$.
To prove this we first state a series of lemmas which are proved in the Appendix.

If $d \in \tilde{A}_1(\alpha_1, \alpha_2)$, then in the regime $E = cn\log n/2$, we have $d_{\max} = o(E^{1/4})$. Observe from (6) that
$4E\lambda = n(\mathrm{Var}(d) + \bar{d}^2 - \bar{d})$. Moreover, for $d \in \tilde{A}_1(\alpha_1, \alpha_2)$, we have $\mathrm{Var}(d) < \max\{\alpha_1, \alpha_2\} \log n$, so that

$$\lambda < \frac{1}{2}\left(c\log n - 1 + \frac{1}{c} \max\{\alpha_1, \alpha_2\}\right). \qquad (7)$$

Hence, 

deii(ai,a2) ^ ;^4:-^e^+^' <e"^, (8) 

for all $n$ sufficiently large. Recall that $G_n(d) \le 1$ for all $d$ and, in particular, for $d \in A_2^c$, the complement
of $A_2$. Thus, it follows from (5) and (8) that, for $n$ sufficiently large,

f Ac\ , , V J_p-/9<i=+7(logn)di 



Let $D_1, \dots, D_n$ be independent and identically distributed (iid) random variables, with

$$P(D_i = k) = \frac{1}{F(\gamma)}\, \frac{1}{k!}\, e^{-\beta k^2 + \gamma (\log n) k}, \qquad k \in \mathbb{N}, \qquad (10)$$

where $F(\gamma)$ is a normalization constant. The dependence of the $D_i$ on $n$ and $\gamma$ has not been made explicit
in the notation. We choose $\gamma$ so that $\mathbb{E}D_i = c\log n$, for a specified constant $c$; this is possible by the
following lemma.

Lemma 1. Let = ^ ^7 log n + log log 71 + 5^^; md let k-y — 1 denote the integer part of x-y. Then, 
EDi — and Var{Di) remain bounded as n tends to 00. Moreover, let a — 213 {x^ — A;^ + ^) and 

y-00 ej-/3/ 



Z^7 = -oo ^ 



■I] 

Then, the moment generating function of Di satisfies 

Proof: See the proof in section 4.1. □
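The choice of the tilt $\gamma$ in (10) can be illustrated numerically. The sketch below is not part of the paper's argument: it truncates the support of $D_i$ at a finite `k_max` (an assumption made only to keep the sums finite) and finds $\gamma$ by bisection, using the fact that the mean of an exponential family is increasing in its natural parameter. All function names are hypothetical.

```python
import math


def tilted_pmf(beta, gamma, log_n, k_max):
    """P(D = k) proportional to exp(-beta k^2 + gamma log(n) k) / k!, for k = 0..k_max."""
    log_w = [-beta * k * k + gamma * log_n * k - math.lgamma(k + 1)
             for k in range(k_max + 1)]
    shift = max(log_w)  # subtract the maximum to avoid overflow
    w = [math.exp(x - shift) for x in log_w]
    total = sum(w)
    return [x / total for x in w]


def mean_degree(beta, gamma, log_n, k_max):
    return sum(k * p for k, p in enumerate(tilted_pmf(beta, gamma, log_n, k_max)))


def solve_gamma(beta, c, n, tol=1e-10):
    """Bisection for gamma such that E[D] = c log n (requires c log n > 0)."""
    log_n = math.log(n)
    target = c * log_n
    k_max = int(10 * target) + 50  # truncation point, an assumption of this sketch
    lo, hi = -1.0, 1.0
    while mean_degree(beta, lo, log_n, k_max) > target:
        lo *= 2
    while mean_degree(beta, hi, log_n, k_max) < target:
        hi *= 2
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_degree(beta, mid, log_n, k_max) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

The Gaussian factor $e^{-\beta k^2}$ makes the tail beyond `k_max` negligible for the parameter ranges of interest, which is why the truncation does not noticeably affect the computed mean.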
Let $D$ denote the random vector $(D_1, \dots, D_n)$. We can now rewrite (9) as

^„(Ai) ^ p(D e A^) ^^^^ 



7r„(^i(ai,a2)) P(D e Ai(ai, ^2)) 

Lemma 2. There exists a constant K > 0, independent of n, such that 

P(D e A^) < Kne"'^^'/^ . (12 



Proof: See the proof in section 4.2. □
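Let $(\tilde{D}_1, \dots, \tilde{D}_n)$ have the joint distribution of $(D_1, \dots, D_n)$ conditional on $D \in A_1(\alpha_1, \alpha_2)$. Equivalently,
$\tilde{D}_1, \dots, \tilde{D}_n$ are iid, with $\tilde{D}_j$ having the distribution of $D_j$ conditional on

$$-\sqrt{\alpha_1 \log n} < D_j - \mathbb{E}D_j < \sqrt{\alpha_2 \log n}.$$

Now

$$P(D \in \tilde{A}_1(\alpha_1, \alpha_2)) = P(D \in A_1(\alpha_1, \alpha_2))\, P\left(\sum_{j=1}^{n} D_j = cn\log n \,\Big|\, D \in A_1(\alpha_1, \alpha_2)\right)$$
$$= P(D \in A_1(\alpha_1, \alpha_2))\, P\left(\sum_{j=1}^{n} \tilde{D}_j = cn\log n\right). \qquad (13)$$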
Suppose $\alpha_1, \alpha_2 > 0$ are chosen large enough so that, for $n$ large, $\mathbb{E}\tilde{D}_1 = \mathbb{E}D_1 = c\log n$.

We wish to estimate the probability that $\tilde{D}_1 + \tilde{D}_2 + \cdots + \tilde{D}_n = cn \log n$. We shall do this using a result
from [16]. For $j = 1, \dots, n$, define the centred random variables $X_{nj} = \tilde{D}_j - \mathbb{E}\tilde{D}_j$; we have made the
dependence of the distribution of $\tilde{D}_j$ on $n$ explicit in the notation. Thus, $X_{n1}, X_{n2}, \dots$ is an array
of integer-valued zero mean random variables such that, for each $n$, $X_{n1}, \dots, X_{nn}$ are independent and
identically distributed. Now, to apply [16, Theorem 1], we need the following result.

Lemma 3. The random variables $\{X_{nj},\ j = 1, \dots, n,\ n \in \mathbb{N}\}$ satisfy the following conditions:
(i) $\limsup_{n\to\infty} \mathbb{E}\left[e^{\theta |X_{n1}|}\right] < \infty$ for some $\theta > 0$.
(ii) $\liminf_{n\to\infty} \mathrm{Var}(X_{n1}) > 0$.
(iii) $\liminf_{n\to\infty} \sum_{j=-\infty}^{\infty} \min\{P(X_{n1} = j), P(X_{n1} = j + 1)\} > 0$.



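Proof: See the proof in section 4.3. □

Indeed, an immediate corollary of [16, Theorem 1] is the following.

Theorem 3. If a sequence of independent random variables, $\{X_{nj},\ j = 1, \dots, n,\ n \in \mathbb{N}\}$, satisfies the
conditions (i), (ii) and (iii) of Lemma 3, then

$$P\left(\sum_{j=1}^{n} X_{nj} = 0\right) = \frac{1}{\sqrt{2\pi \sum_{j=1}^{n} \mathrm{Var}(X_{nj})}} \left(1 + O\left(\frac{1}{n}\right)\right). \qquad (14)$$

A direct application of the above result yields

$$P\left(\sum_{j=1}^{n} \tilde{D}_j = cn\log n\right) = \frac{1}{\sigma\sqrt{2\pi n}} \left(1 + O\left(\frac{1}{n}\right)\right),$$

where $\sigma^2 = \mathrm{Var}(\tilde{D}_1)$ remains bounded as $n \to \infty$. Combining this with (11), (12) and (13), we get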

Lemma 4. Let $D$ denote the random vector $(D_1, \dots, D_n)$. Given any $K > 0$, we can choose $\alpha_1$ and $\alpha_2$
such that $P(D \in A_1(\alpha_1, \alpha_2)^c) < e^{-K\log n}$ for all $n$ sufficiently large.

Proof: See the proof in section 4.4. □

Combining the above lemma with the bound in (15), it is immediate that $\pi_n(A_2^c) \to 0$ as $n \to \infty$, which
establishes the claim of Theorem 2. Thus, to prove Theorem 1 we can restrict our attention to graphs
with degree sequences in $A_2$, for which we can use the estimate in (6).
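Proof of Theorem 1. Observe that

$$\pi_n(\tilde{A}_1(\alpha_1, \alpha_2)) = \pi_n(A) - \pi_n(A \setminus \tilde{A}_1(\alpha_1, \alpha_2))$$
$$\ge \pi_n(A) - \pi_n((A \setminus \tilde{A}_1(\alpha_1, \alpha_2)) \cap A_2) - \pi_n(A_2^c).$$

But $\pi_n(A) = 1$ by definition, and we have shown above that $\pi_n(A_2^c) \to 0$ as $n \to \infty$. Hence, it suffices to
show that

$$\pi_n((A \setminus \tilde{A}_1(\alpha_1, \alpha_2)) \cap A_2) \to 0, \quad \text{as } n \to \infty. \qquad (16)$$

Recall from (6) that, if $d \in A_2$, then $G_n(d) \sim e^{-\lambda(d) - \lambda(d)^2}$. Now,

$$\lambda(d) = \frac{\mathrm{Var}(d) + \bar{d}^2 - \bar{d}}{2\bar{d}} \ge \frac{c\log n - 1}{2},$$

since the mean degree $\bar{d} = c\log n$. In particular, the above lower bound on $\lambda(d)$ holds for all degree
sequences $d$ in $(A \setminus \tilde{A}_1(\alpha_1, \alpha_2)) \cap A_2$, since this is a subset of $A$.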
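In addition, we saw earlier in (7) that, if $d \in \tilde{A}_1(\alpha_1, \alpha_2)$, then

$$\lambda(d) < \frac{1}{2}\left(c\log n - 1 + \frac{1}{c} \max\{\alpha_1, \alpha_2\}\right),$$

and the estimate in (6) holds.

Denote $\max\{\alpha_1, \alpha_2\}$ by $\alpha$. Now, by (5),

$$\frac{\pi_n((A \setminus \tilde{A}_1(\alpha_1, \alpha_2)) \cap A_2)}{\pi_n(\tilde{A}_1(\alpha_1, \alpha_2))}
= \frac{\sum_{d \in (A \setminus \tilde{A}_1(\alpha_1, \alpha_2)) \cap A_2} G_n(d) \prod_{i=1}^{n} \frac{1}{d_i!} e^{-\beta d_i^2 + \gamma (\log n) d_i}}{\sum_{d \in \tilde{A}_1(\alpha_1, \alpha_2)} G_n(d) \prod_{i=1}^{n} \frac{1}{d_i!} e^{-\beta d_i^2 + \gamma (\log n) d_i}}$$
$$\le (1 + o(1))\, e^{\frac{\alpha}{2c}\left(c\log n + \frac{\alpha}{2c}\right)}\,
\frac{\sum_{d \in (A \setminus \tilde{A}_1(\alpha_1, \alpha_2)) \cap A_2} \prod_{i=1}^{n} \frac{1}{d_i!} e^{-\beta d_i^2 + \gamma (\log n) d_i}}{\sum_{d \in \tilde{A}_1(\alpha_1, \alpha_2)} \prod_{i=1}^{n} \frac{1}{d_i!} e^{-\beta d_i^2 + \gamma (\log n) d_i}},$$

where the inequality uses the lower bound on $\lambda(d)$ in the numerator and the upper bound (7) in the denominator.
In other words, there are constants $\kappa_1$ and $\kappa_2$ such that

$$\frac{\pi_n((A \setminus \tilde{A}_1(\alpha_1, \alpha_2)) \cap A_2)}{\pi_n(\tilde{A}_1(\alpha_1, \alpha_2))}
\le \kappa_1 e^{\kappa_2 \log n}\, \frac{P(D \in (A \setminus \tilde{A}_1(\alpha_1, \alpha_2)) \cap A_2)}{P(D \in \tilde{A}_1(\alpha_1, \alpha_2))}
\le \kappa_1 e^{\kappa_2 \log n}\, \frac{P(D \in A \setminus \tilde{A}_1(\alpha_1, \alpha_2))}{P(D \in \tilde{A}_1(\alpha_1, \alpha_2))}. \qquad (17)$$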

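Now, by Lemma 4, for any given $K > 0$, we can choose $\alpha_1$ and $\alpha_2$ such that $P(D \in A_1(\alpha_1, \alpha_2)^c) < e^{-K\log n}$.
Thus,

$$P(D \in A \setminus \tilde{A}_1(\alpha_1, \alpha_2)) \le P(D \in A_1(\alpha_1, \alpha_2)^c) < e^{-K\log n}. \qquad (18)$$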

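Moreover, analogous to (14), we have

$$P(D \in A) = P\left(\sum_{i=1}^{n} D_i = cn\log n\right) = \frac{1}{\sigma\sqrt{2\pi n}} \left(1 + O\left(\frac{1}{n}\right)\right),$$

where $\sigma^2 = \mathrm{Var}(D_1)$ remains bounded as $n \to \infty$. Therefore,

$$P(D \in \tilde{A}_1(\alpha_1, \alpha_2)) = P(D \in A) - P(D \in A \cap A_1(\alpha_1, \alpha_2)^c)$$
$$\ge P(D \in A) - P(D \in A_1(\alpha_1, \alpha_2)^c)$$
$$\ge \frac{1}{\sigma\sqrt{2\pi n}} \left(1 + O\left(\frac{1}{n}\right)\right) - e^{-K\log n} = \frac{1}{\sigma\sqrt{2\pi n}} \left(1 + O\left(\frac{1}{n}\right)\right), \qquad (19)$$

where the last equality holds once $K \ge 3/2$.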



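Substituting (18) and (19) in (17), we have

$$\pi_n((A \setminus \tilde{A}_1(\alpha_1, \alpha_2)) \cap A_2) \le \frac{\pi_n((A \setminus \tilde{A}_1(\alpha_1, \alpha_2)) \cap A_2)}{\pi_n(\tilde{A}_1(\alpha_1, \alpha_2))}
\le \kappa_1 \sigma \sqrt{2\pi n}\, e^{(\kappa_2 - K)\log n} \left(1 + O\left(\frac{1}{n}\right)\right).$$

Since $K$ can be chosen arbitrarily large, the above quantity goes to zero as $n \to \infty$, which establishes (16)
and the claim of the theorem. □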

3 Graph cuts 

Given a graph $G$ and a subset $U$ of its vertex set, let $e_U(G)$ denote the number of edges incident within
$U$ (i.e., having both their vertices in $U$); let $e_{U,U^c}(G)$ denote the number of edges having one vertex in
$U$ and the other in its complement $U^c$ (i.e., crossing the cut $(U, U^c)$); and denote by $u$ or $|U|$ the number
of vertices or size of $U$.
Let $d(G) = (d_1, d_2, \dots, d_n)$ denote the degree sequence of $G$ and define the volume of a subset of
vertices $U$ by

$$\mathrm{Vol}(U) = \sum_{i \in U} d_i.$$

Note that

$$2e_U(G) + e_{U,U^c}(G) = \mathrm{Vol}(U). \qquad (20)$$
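The quantities in (20) are easy to compute for a concrete graph, which gives a quick sanity check of the identity. The sketch below assumes a simple graph given as a list of node pairs; the function name is hypothetical.

```python
def cut_statistics(edges, subset):
    """Return (e_U, e_{U,U^c}, Vol(U)) for a simple graph given as an edge list."""
    inside = set(subset)
    e_in = e_cut = vol = 0
    for u, v in edges:
        in_u, in_v = u in inside, v in inside
        vol += in_u + in_v          # each endpoint in U adds one to Vol(U)
        if in_u and in_v:
            e_in += 1               # edge with both endpoints in U
        elif in_u or in_v:
            e_cut += 1              # edge crossing the cut (U, U^c)
    return e_in, e_cut, vol
```

For the 4-cycle on nodes 0..3 with the extra chord (0, 2) and $U = \{0, 1\}$, this returns one internal edge, three crossing edges and volume five, in agreement with (20).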

In the remainder of this section we derive lower bounds for the graph cuts. To this end we will show that
there exists a constant $\delta$ such that $e_{U,U^c}(G) > (1 - \delta)|U| c\log n$, whp, using different techniques depending
on the size of $U$, when $|U| < n/2$.

Proposition 1. For any $\varepsilon > 0$, there exists $\delta_1 \in (0, 1)$, independent of $n$, such that, if the subset of
vertices $U$ is such that $u < 2\varepsilon c\log n$, then $e_{U,U^c}(G) > (1 - \delta_1) u c\log n$, whp.

Proof: Denote $|U|$ by $u$. Suppose first that $u < 2\varepsilon c\log n$, for a given $\varepsilon > 0$. The number of edges incident
within $U$ can be at most $\binom{u}{2}$, so $e_U(G) < \varepsilon u c\log n$, for all $U$. Now, for any degree sequence $d \in A_1(\alpha_1, \alpha_2)$,
$\mathrm{Vol}(U) > cu \log n - u\sqrt{\alpha_1 \log n}$. By Theorem 1, it is not restrictive to consider only graphs with degree
sequences belonging to the set $A_1(\alpha_1, \alpha_2)$. Hence, using (20) for graphs $G$ with such degree sequences,

$$e_{U,U^c}(G) > u\left[(1 - 2\varepsilon) c\log n - \sqrt{\alpha_1 \log n}\right].$$

Let $\delta_1 = 3\varepsilon$. Then, for $n$ sufficiently large, $e_{U,U^c}(G) > (1 - \delta_1) u c\log n$, whp, whenever $u < 2\varepsilon c\log n$, and
the claim of the proposition is established. □

To prove a similar result for all subsets $U$ such that $u < n/2$ we will use the configuration model [3].
Fix a degree sequence $d \in A_1(\alpha_1, \alpha_2)$. By (3), all graphs with the same degree sequence are equally likely
under the distribution $\mu_n$, so we can use the configuration model to generate a random graph with this
distribution, conditional on the degree sequence.

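For constants $\delta \in (0, 1)$, $\varepsilon > 0$, and $\tau > 0$, for $n \in \mathbb{N}$ and a degree sequence $d$, we define the following
subsets of graphs on a vertex set $V$ of cardinality $n$:

$$\mathcal{E}_1(n, \delta, \tau, d) = \{G : d(G) = d \text{ and } e_{U,U^c}(G) < (1 - \delta) u c\log n \text{ for some } U \subset V \text{ with } 2\varepsilon c\log n < u < \tau n\}, \qquad (21)$$
$$\mathcal{E}_2(n, \delta, \tau, d) = \{G : d(G) = d \text{ and } e_{U,U^c}(G) < (1 - \delta) u c\log n \text{ for some } U \subset V \text{ with } \tau n < u < n/2\}. \qquad (22)$$

We also define

$$\mathcal{E}_1(n, \delta, \tau) = \bigcup_{d} \mathcal{E}_1(n, \delta, \tau, d), \qquad \mathcal{E}_2(n, \delta, \tau) = \bigcup_{d} \mathcal{E}_2(n, \delta, \tau, d). \qquad (23)$$

We shall derive bounds on the probabilities of these sets using the configuration model [3]. To this end,
we define the analogous sets of configurations $\tilde{\mathcal{E}}_1(n, \delta, \tau, d)$, $\tilde{\mathcal{E}}_2(n, \delta, \tau, d)$, $\tilde{\mathcal{E}}_1(n, \delta, \tau)$ and $\tilde{\mathcal{E}}_2(n, \delta, \tau)$
on the same vertex set. More precisely, given a degree sequence $d = (d_1, d_2, \dots, d_n)$, and for $H$ a
configuration on $V$, we define

$$\tilde{\mathcal{E}}_1(n, \delta, \tau, d) = \{H : d(H) = d \text{ and } e_{U,U^c}(H) < (1 - \delta) u c\log n \text{ for some } U \subset V \text{ with } 2\varepsilon c\log n < u < \tau n\}, \qquad (24)$$
$$\tilde{\mathcal{E}}_2(n, \delta, \tau, d) = \{H : d(H) = d \text{ and } e_{U,U^c}(H) < (1 - \delta) u c\log n \text{ for some } U \subset V \text{ with } \tau n < u < n/2\}, \qquad (25)$$

and

$$\tilde{\mathcal{E}}_1(n, \delta, \tau) = \bigcup_{d} \tilde{\mathcal{E}}_1(n, \delta, \tau, d), \qquad \tilde{\mathcal{E}}_2(n, \delta, \tau) = \bigcup_{d} \tilde{\mathcal{E}}_2(n, \delta, \tau, d). \qquad (26)$$

Recall that configurations correspond to multigraphs, i.e., there may be loops or multiple edges. A
multiple edge is counted the corresponding number of times in the above definitions.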
Since $d \in A_1(\alpha_1, \alpha_2)$, estimate (6) holds. Using the enumeration formula of McKay and Wormald [17],
this bound says that, for $i = 1, 2$,

$$\mu_n(\mathcal{E}_i(n, \delta, \tau, d) \mid d) \le e^{\lambda + \lambda^2}\, P(H \in \tilde{\mathcal{E}}_i(n, \delta, \tau, d) \mid d), \qquad (27)$$

where $P(\cdot \mid d)$ denotes the probability with respect to the uniform distribution on configurations with degree
sequence $d$. Recall that $\lambda$ was defined in (6) to be $\sum_{i=1}^{n} d_i(d_i - 1)/4E$, where $E$ is the number of edges,
i.e., $2E = \sum_{i=1}^{n} d_i$. The dependence of $\lambda$ on $d$ has been suppressed for notational convenience.

Proposition 2. If $\tau \in \left(0, \frac{1}{1+4e}\right)$, then there exists $\delta_2 \in (0, 1)$, independent of $n$, such that

$$\lim_{n \to \infty} \mu_n(\mathcal{E}_1(n, \delta_2, \tau)) = 0,$$

where the distribution $\mu_n$ was defined in (3).

Proof: For degree sequences $d \in A_1(\alpha_1, \alpha_2)$ and any subset $U$ of the vertex set, $\mathrm{Vol}(U) \sim uc\log n$, for $n$ large.
Hence, by (20), $e_{U,U^c}(H) < u(1 - \delta) c\log n$ for a subset $U$ implies that $e_U(H) > \frac{\delta}{2} \mathrm{Vol}(U)$, for sufficiently
large $n$. To prove the proposition it therefore suffices to show that there exists $\delta_2 \in (0, 1)$ such that
$P\left(e_U(H) > \frac{\delta_2}{2} \mathrm{Vol}(U)\right)$ tends to 0 when $n$ tends to infinity.

Recall that for a subset $U$ of $V$, the volume of $U$ is given by $\mathrm{Vol}(U) = \sum_{i \in U} d_i$. Since half-edges
in the configuration model are matched uniformly, $e_U(H)$, the number of edges incident within $U$ in a
random configuration, is bounded above by a binomial random variable $X$ with parameters $\mathrm{Vol}(U)$ and
$\mathrm{Vol}(U)/(2E - \mathrm{Vol}(U))$. The dependence of $X$ on $U$ has been suppressed for notational convenience. For
$\delta \in (0, 1)$, by Chernoff's bound, we have



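$$\log P\left(X > \frac{\delta}{2} \mathrm{Vol}(U)\right)
\le -\mathrm{Vol}(U) \left[\frac{\delta}{2} \log\left(\frac{\frac{\delta}{2}(2E - \mathrm{Vol}(U))}{\mathrm{Vol}(U)}\right)
+ \left(1 - \frac{\delta}{2}\right) \log\left(\frac{\left(1 - \frac{\delta}{2}\right)(2E - \mathrm{Vol}(U))}{2E - 2\mathrm{Vol}(U)}\right)\right]$$
$$= -\mathrm{Vol}(U) \left[\frac{\delta}{2} \log\left(\frac{\delta (2E - \mathrm{Vol}(U))}{2\mathrm{Vol}(U)}\right)
+ \left(1 - \frac{\delta}{2}\right) \log\left(\frac{2E - \mathrm{Vol}(U)}{2E - 2\mathrm{Vol}(U)}\right)
+ \left(1 - \frac{\delta}{2}\right) \log\left(1 - \frac{\delta}{2}\right)\right].$$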



Applying the inequality log x < x — 1 for a; > 1 to a; 
Using the fact that 

Vol(C/) 



we have 



logP 



uc log n 



X > -Vol(C/) j < -uclog7 



1 



< 



= 1/(1- 

1 



2' °' 2 
we have that log(l 



f)> 



-1/(1- 



■log 



c ^/\ogn 

6{n — u) 
2u 



1 + 



Vlogn 



(28) 



Suppose first that $2\varepsilon c\log n < u < \sqrt{n}$.

For all $n$ sufficiently large, equation (28) becomes



logP(^X>^Vol(L/)) <-^log^ 



Since X stochastically dominates eu{H) (conditional on d), we have by the union bound that, for n 
sufficiently large. 



3U, 2eclogn < w < V", eu{H) > |vol(C/) j < 



Ef n\ ( uSc , 2 

iec\ogn 

El / , udc , 2 
— exp[u\ogn ^log " 

u—2€c log n 



(29) 






for two constants $\kappa_3, \kappa_4 > 0$. We have used the inequality $\binom{n}{u} \le n^u/u!$ to obtain the second inequality
above.

Next, consider $\sqrt{n} < u < \tau n$.

In this case equation (28) becomes,



logP X > -Vol(C/) < --uclogn 



1 



Slog 



If $\tau < \frac{1}{1+4e}$, then there exists $\delta_2 \in (0, 1)$ such that



62 log 



62(1 -t) 
2t 



2t 



S2 > -• 



and subsequently, for all n sufficiently large and for u < rn, we have 



logP ( X > ^Vol(C/) ) < -2ulogj 



Hence, by the union bound. 



P [ 3U : < u < rn, euiH) > y Vol(t/) 



U — y^ 



~2u loff n 



< 



rn ^ 



- log n 



By ([771) . (EH) and ([5n)l . for n large, we can find two constants hq.K'j > such that 



^„(£i(n,(52,r,d)|d) < e^+^ kqc 



-K,7 log n 



(30) 



Since $\lambda = O(\log n)$, it is readily checked that $\mu_n(\mathcal{E}_1(n, \delta_2, \tau, d) \mid d \in A_1(\alpha_1, \alpha_2))$ goes to 0 as $n \to \infty$.
By Theorem 1, $\mu_n(d \notin A_1(\alpha_1, \alpha_2))$ goes to 0 as well. Noting that

$$\mu_n(\mathcal{E}_1(n, \delta, \tau)) \le \mu_n(\mathcal{E}_1(n, \delta, \tau, d) \mid d \in A_1(\alpha_1, \alpha_2)) + \mu_n(d \notin A_1(\alpha_1, \alpha_2)),$$

the claim of the proposition is established. □

Next, we find a similar lower bound for $e_{U,U^c}(G)$ that holds, whp, for subsets $U$ with $\tau n < u < n/2$.

Proposition 3. For $\tau > 0$, there exists $\delta_3 \in (0, 1)$, independent of $n$, such that

$$\lim_{n \to \infty} \mu_n(\mathcal{E}_2(n, \delta_3, \tau)) = 0.$$

Proof: As in the proof of Proposition 2, we fix a degree sequence $d$ and a subset $U$, and bound the
probability that $e_{U,U^c}(G) < u(1 - \delta) c\log n$ in terms of the probability that $e_{U,U^c}(H) < u(1 - \delta) c\log n$,
where $H$ is drawn uniformly at random from configurations with degree sequence $d$, i.e.,

$$\mu_n(\mathcal{E}_2(n, \delta, \tau, d) \mid d) \le e^{\lambda + \lambda^2}\, P(H \in \tilde{\mathcal{E}}_2(n, \delta, \tau, d) \mid d). \qquad (31)$$

Fix constants $\tau > 0$ and $\delta \in (0, 1)$, and a degree sequence $d$. Let $U$ be a subset of the vertex set with
$\tau n < u < n/2$, and let $j < (1 - \delta) u c\log n < \frac{1}{2}(1 - \delta) cn\log n$. Recall that the number of configurations
with degree sequence $d$ is
with degree sequence d is 

{2E)\ " 



Hn{d) = 



E12E 



Y[d 



(32) 
where $E = \sum_{i=1}^{n} d_i/2$ is the total number of edges. The number of these configurations with exactly $j$
edges crossing the cut between $U$ and $U^c$ is



v ^ 



The dependence of this count on $d$ has been suppressed for notational convenience. The first two terms on the
right above count the number of ways we can choose $j$ configuration points each from $U$ and $U^c$ to match
up. The term $j!$ counts the number of ways of matching them. The remaining configuration points have
to be matched within the sets $U$ and $U^c$, as there are only $j$ edges crossing the cut. The number of ways of
doing this is the number of configurations on $U$ with $\mathrm{Vol}(U) - j$ points, times the number of configurations
on $U^c$ with $2E - \mathrm{Vol}(U) - j$ points, and with a degree sequence strictly bounded by $d$ (since $j$ points each
in $U$ and $U^c$ have been used up). This yields the remaining terms in the bound above. We obtain from
(32) and (33) after some simplification that

or /■^ C^r .-^ wVol(;7)/2wB-(Vol(;7)/2)N 

^"^"^^ Wo\{u)){i/2) 
Taking logarithms and using Stirling's formula, we get 

1 ,rr^ ^ ^, Voire/) \ Vol(C/), f j \ 2E-Yo\{U), 



2E - Vol([/) 



(^)+0,los.„), (.4) 



--2Eh 

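where, for $x \in [0, 1]$, $h(x) = -x \log x - (1 - x) \log(1 - x)$ is the binary entropy of $x$. Now, $2E = cn \log n$ and,
since it was assumed that $d \in A_1(\alpha_1, \alpha_2)$, $|\mathrm{Vol}(U) - cu\log n| < u\sqrt{\alpha \log n}$, where $\alpha = \max\{\alpha_1, \alpha_2\}$. Moreover,
$\tau n < u < n/2$, while $j < \frac{1}{2}(1 - \delta) cn\log n$. Hence, for some $\delta_1$ and for large enough $n$, we have, for all
$\delta > \delta_1$,

$$h\left(\frac{j}{\mathrm{Vol}(U)}\right) \le h\left(\frac{(1 - \delta) n\log n}{2\tau n\log n}\right) = h\left(\frac{1 - \delta}{2\tau}\right),$$

and it can likewise be shown that, for some $\delta_2$ and for large enough $n$, we have, for all $\delta > \delta_2$,

$$h\left(\frac{j}{2E - \mathrm{Vol}(U)}\right) \le h(1 - \delta).$$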

On the other hand, as \U\ < n/2, for n large. 

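Using the fact that $\mathrm{Vol}(U) \le 2E$, for all $U$, it follows from (34) that, for $n$ sufficiently large,

$$\log P(e_{U,U^c}(H) = j) \le -E\left(h(\tau) - h\left(\frac{1 - \delta}{2\tau}\right) - h(1 - \delta)\right) \le -\kappa n\log n, \qquad (35)$$

where $\delta$ is chosen big enough so that $h(\tau) - h\left(\frac{1-\delta}{2\tau}\right) - h(1 - \delta) > 0$, i.e., $\kappa > 0$.

The above bound applies for all subsets $U$ of $V$ of size $u$, where $\tau n < u < n/2$. The number of subsets
$U$ with cardinality between $\tau n$ and $n/2$ is smaller than the total number of subsets, which is $2^n$. Hence,
by the union bound,

$$P(H : \exists\, U \text{ with } \tau n < u < n/2 \text{ and } e_{U,U^c}(H) = j) \le 2^n e^{-\kappa n\log n}.$$

The above holds for each $j < \frac{1}{2}(1 - \delta) cn\log n$. Applying the union bound once more,

$$P(H \in \tilde{\mathcal{E}}_2(n, \delta, \tau, d) \mid d) \le (1 - \delta)\, cn\log(n)\, 2^{n-1} e^{-\kappa n\log n},$$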
for all $d \in A_1(\alpha_1, \alpha_2)$. Substituting this in (31) and noting that $\lambda = O(\log n)$, we see that, for $\delta$ large
enough,

$$\mu_n(\mathcal{E}_2(n, \delta, \tau, d) \mid d \in A_1(\alpha_1, \alpha_2)) \to 0 \quad \text{as } n \to \infty.$$

We also know from Theorem 1 that $\mu_n(d \notin A_1(\alpha_1, \alpha_2))$ goes to zero. Since

$$\mu_n(\mathcal{E}_2(n, \delta, \tau)) \le \mu_n(\mathcal{E}_2(n, \delta, \tau, d) \mid d \in A_1(\alpha_1, \alpha_2)) + \mu_n(d \notin A_1(\alpha_1, \alpha_2)),$$

there exists $\delta_3 > 0$ such that $\mu_n(\mathcal{E}_2(n, \delta_3, \tau)) \to 0$ as $n \to \infty$, as claimed. □

Fix $\varepsilon > 0$ and $\tau < 1/(1 + 4e)$; then by Propositions 1, 2 and 3 there exists $\delta$, independent of $n$, which
is the maximum of $\delta_1$, $\delta_2$ and $\delta_3$, for which the three propositions hold. Hence we have the following lower
bound for the graph cut.

Theorem 4. For graphs $G$ drawn according to (3), there exists $\delta \in (0, 1)$ such that for $U$ a subset of $V$ with
$u = |U| < n/2$, the number of edges crossing the cut $(U, U^c)$ is such that

$$e_{U,U^c} > (1 - \delta) cu\log n, \quad \text{whp}.$$

3.1 Conductance and Expansion 

Using Theorem 4 we can easily recover asymptotic results on the conductance and the expansion of a
graph drawn according to (3), which are relevant for phenomena such as routing congestion analysis [13],
the behaviour of random walks in terms of the mixing and cover times [15], and epidemic threshold [12].

Let $A = (a_{ij})_{i,j=1,\dots,n}$ be the adjacency matrix of a graph $G$ and $D = \mathrm{Diag}(d_1, \dots, d_n)$ the diagonal
matrix of the degree distribution of $G$. First, we define the isoperimetric constant or expansion of a graph
$G$ by

$$\phi = \inf_{U \subset V,\ u \le n/2} \frac{e_{U,U^c}}{u}.$$

It is related to $\lambda_2(L)$, the second (smallest) eigenvalue of the Laplacian $L = D - A$ of the graph, through
the following inequality [51 HH]

$$\frac{\phi^2}{2 d_{\max}} \le \lambda_2(L) \le 2\phi.$$

The lower bound in the above inequality is known as Cheeger's inequality.
The conductance of a graph $G$ is defined by

$$\Phi = \inf_{U \subset V,\ \mathrm{Vol}(U) \le E} \frac{e_{U,U^c}}{\mathrm{Vol}(U)}.$$

Let $\lambda_2(P)$ be the second (largest) eigenvalue of $P$, the transition matrix of the simple random walk on
the graph, with $p_{ij} = a_{ij}/d_i$. By Cheeger's inequality [15, Theorem 5.3],

$$\frac{\Phi^2}{8} \le 1 - \lambda_2(P) \le \Phi.$$

Theorem 5. For graphs $G$ drawn according to (3), and for the constant $\delta$ of Theorem 4, we have that the
expansion $\phi$ and the conductance $\Phi$ satisfy

$$(1 - \delta) c\log n \le \phi \le c\log n, \qquad (1 - \delta) \le \Phi \le 1, \quad \text{whp}.$$

Proof: First note that if $d_{\min}$ is the minimum degree of $G$, then by Theorem 1, $c\log n - \sqrt{\alpha_1 \log n} < d_{\min} \le c\log n$,
whp. Taking $U$ to be a single node of minimum degree gives $\phi \le d_{\min}$, and $\Phi \le 1$ since $e_{U,U^c} \le \mathrm{Vol}(U)$. Hence,

$$\phi \le (1 + o(1)) c\log n, \qquad \Phi \le (1 + o(1)), \quad \text{whp}.$$

The lower bounds follow from Theorem 4. □
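For small graphs, the expansion and conductance defined above can be computed by brute force over all subsets, which is a convenient way to check the definitions on examples. The sketch below is exponential in $n$ and is meant for illustration only; the function name is hypothetical and the graph is assumed simple and given as an edge list on nodes $0, \dots, n-1$.

```python
from itertools import combinations


def expansion_and_conductance(n, edges):
    """Brute-force expansion phi and conductance Phi of a small simple graph."""
    num_edges = len(edges)
    phi = float("inf")
    conductance = float("inf")
    for size in range(1, n):
        for subset in combinations(range(n), size):
            inside = set(subset)
            cut = sum((u in inside) != (v in inside) for u, v in edges)
            vol = sum((u in inside) + (v in inside) for u, v in edges)
            if 2 * size <= n:             # subsets with u <= n/2
                phi = min(phi, cut / size)
            if 0 < vol <= num_edges:      # subsets with Vol(U) <= E
                conductance = min(conductance, cut / vol)
    return phi, conductance
```

On the complete graph with four nodes, the minimising sets for both quantities are the pairs of nodes, giving expansion 2 and conductance 2/3.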
3.2 Failure resilience



In the following, we work with graphs whose degree sequence belongs to the set $A_1(\alpha_1, \alpha_2)$, for some
specified $\alpha_1$ and $\alpha_2$. We are interested in the probability that the graph remains connected when links
fail independently with probability $p$. It is straightforward to compute the probability that a given node
$i$ becomes isolated due to link failures; it is simply $p^{d_i}$. Thus, by the union bound, the probability that
some node becomes isolated is at most

$$\sum_{i=1}^{n} p^{d_i} \le n\, p^{c\log n - \sqrt{\alpha_1 \log n}} = \exp\left[(1 + c\log p) \log n - \sqrt{\alpha_1 \log n}\, \log p\right].$$

Hence, if $c\log p < -1$ or, equivalently, $p < \exp(-1/c)$, then the probability that some node becomes
isolated goes to zero as $n$ increases to infinity.
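The union bound above can be evaluated numerically to see how quickly it decays once $p < \exp(-1/c)$. The sketch below computes both forms of the bound; the function names and the parameter values in the usage note are illustrative assumptions, not values from the paper.

```python
import math


def isolation_union_bound(n, c, p, alpha1):
    """n * p^(c log n - sqrt(alpha1 log n)): bound on P(some node is isolated)."""
    log_n = math.log(n)
    return n * p ** (c * log_n - math.sqrt(alpha1 * log_n))


def isolation_union_bound_exp_form(n, c, p, alpha1):
    """The same bound written as exp[(1 + c log p) log n - sqrt(alpha1 log n) log p]."""
    log_n = math.log(n)
    return math.exp((1 + c * math.log(p)) * log_n
                    - math.sqrt(alpha1 * log_n) * math.log(p))
```

For example, with $c = 2$, $p = 0.5$ (below $e^{-1/2} \approx 0.61$) and $\alpha_1 = 1$, the bound decreases as $n$ grows from $10^6$ to $10^{12}$, although slowly, because of the $\sqrt{\log n}$ correction term.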

By way of comparison, consider the classical random graph model of Erdős and Rényi [7] with the same
mean degree. Here, an edge is present between each pair of nodes with probability $c\log n/n$, independent
of all other edges. Here we should assume that $c > 1$ to ensure that the Erdős-Rényi graph is connected,
whp. After taking failures into account, the edge probability becomes $(1 - p) c\log n/n$, and the presence
of edges continues to be mutually independent. It is well known for this model that, if $(1 - p)c < 1$, then
the graph is disconnected with high probability. Moreover, in a sense that can be made precise, the main
reason for disconnection when $(1 - p)c$ is "close to" 1 is the isolation of individual nodes. Intuitively, these
arguments suggest that balanced random graphs can tolerate link failure rates up to $e^{-1/c}$ while retaining
connectivity, whereas classical random graphs can only tolerate failure rates up to $(c - 1)/c$. We now
rigorously establish a weaker result.
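The two heuristic thresholds in the comparison above are easy to tabulate. The sketch below simply evaluates $e^{-1/c}$ and $(c-1)/c$; since $e^{-x} > 1 - x$ for $x \neq 0$, the first always exceeds the second. The function names are hypothetical.

```python
import math


def balanced_threshold(c):
    """Heuristic tolerable failure rate exp(-1/c) for the balanced random graph."""
    return math.exp(-1.0 / c)


def erdos_renyi_threshold(c):
    """Tolerable failure rate (c - 1)/c for the classical random graph, c > 1."""
    return (c - 1.0) / c


for c in (1.5, 2.0, 5.0, 20.0):
    print(c, round(balanced_threshold(c), 4), round(erdos_renyi_threshold(c), 4))
```

The gap between the two is largest for small $c$, that is, for sparse overlays, and closes as $c$ grows.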

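We shall use Theorem 4 to show that random graphs drawn from the distribution $\mu_n$ can tolerate link
failure rates up to $\exp\left(-\frac{1}{c(1-\delta)}\right)$, where $\delta$ is defined in Theorem 4, without losing connectivity.

Theorem 6. For any $p < \exp\left(-\frac{1}{c(1-\delta)}\right)$, a graph $G$ chosen at random from the distribution $\mu_n$ and
subjected to independent link failures with probability $p$ remains connected, whp.

Proof: Fix $p < \exp\left(-\frac{1}{c(1-\delta)}\right)$. For a subset $U$ of the vertex set, let $\tilde{e}_{U,U^c}$ denote the number of edges between
$U$ and $U^c$ that have not failed. We shall show that, with high probability, $\tilde{e}_{U,U^c} > 0$ for all subsets $U$, i.e.,
the graph is connected. Now,

$$\mu_n(\tilde{e}_{U,U^c}(G) = 0 \mid e_{U,U^c}(G)) = p^{e_{U,U^c}(G)}.$$

Assume that $e_{U,U^c}(G) > (1 - \delta) uc\log n$, for all $U \subset V$ with $u < \tau n$. Hence,

$$\mu_n(\exists\, U : u < \tau n,\ \tilde{e}_{U,U^c}(G) = 0) \le \sum_{u=1}^{\tau n} \binom{n}{u} p^{(1-\delta) uc\log n}.$$

Since $p < \exp\left(-\frac{1}{(1-\delta)c}\right)$ is given, then for some $\varepsilon > 0$ and $n$ large, $p^{(1-\delta) c\log n} \le e^{-(1+\varepsilon)\log n}$. Using the
inequality $\binom{n}{u} \le n^u/u!$, we get

$$\sum_{u=1}^{\tau n} \binom{n}{u} p^{(1-\delta) uc\log n} \le \sum_{u=1}^{\tau n} \frac{1}{u!} \left(n p^{(1-\delta) c\log n}\right)^u
\le \exp\left(n p^{(1-\delta) c\log n}\right) - 1 \le \exp\left(n e^{-(1+\varepsilon)\log n}\right) - 1, \qquad (36)$$

which goes to zero as $n \to \infty$.

Suppose that $e_{U,U^c}(G) > (1 - \delta) cu\log n$ for all $U \subset V$ with $\tau n < u < n/2$. Hence,

$$\mu_n(\exists\, U : \tau n < u < n/2,\ \tilde{e}_{U,U^c}(G) = 0) \le \sum_{U : \tau n < u < n/2} p^{(1-\delta) cu\log n} \le 2^n p^{(1-\delta) \tau cn\log n}. \qquad (37)$$

We see from (36) and (37) that

$$\mu_n(\exists\, U : \tilde{e}_{U,U^c}(G) = 0 \mid e_{U,U^c}(G) > (1 - \delta) cu\log n \text{ for all } U) \to 0 \quad \text{as } n \to \infty.$$

Also, by Theorem 4,

$$\mu_n(e_{U,U^c}(G) < (1 - \delta) cu\log n \text{ for some } U \subset V,\ 0 < u < n/2) \to 0 \quad \text{as } n \to \infty,$$

when $G$ is chosen according to the distribution $\mu_n$, which establishes the claim of the theorem. □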

References 

[1] F. Ball and A. Barbour. Poisson approximation for some epidemic models, Journal of Applied 
Probability 27, 479-490, 1990. 

[2] J.E. Besag. Spatial interaction and statistical analysis of lattice systems. Journal of the Royal 
Statistical Society Ser. B 36, 192-236, 1974. 

[3] B. Bollobás. Random Graphs, Cambridge University Press, 2001. 

[4] P. Brémaud. Markov chains, Gibbs fields, Monte Carlo Simulation, and Queues, Springer-Verlag, 
2001. 

[5] Cachelogic research. Peer-to-peer in 2005, http://www.cachelogic.com/research/p2p2005.php 

[6] F. Chung. Laplacians of graphs and Cheeger's inequalities, in Combinatorics, Paul Erdős is Eighty, 
Vol. 2 (Keszthely, 1993), Bolyai Math. Soc., Budapest, 157-172, 1996. 

[7] P. Erdős and A. Rényi. On the evolution of random graphs. Mat. Kutató Int. Közl. 5, 17-61, 1960. 

[8] O. Frank and D. Strauss. Markov Graphs, Journal Amer. Stat. Assoc. 81, 832-842, 1986. 

[9] A.J. Ganesh, A.-M. Kermarrec, and L. Massoulié. Probabilistic reliable dissemination in large-scale 
systems, IEEE Transactions on Parallel and Distributed Systems 14(3), 248-258, 2003. 

[10] A.J. Ganesh, A.-M. Kermarrec, and L. Massoulié. Network Awareness and Failure Resilience in Self- 
Organising Overlay Networks, in Proceedings IEEE Symposium on Reliable and Distributed Systems, 
47-55, 2003. 

[11] A.J. Ganesh and L. Massoulié. Failure resilience in balanced overlay networks, in Proceedings 41st 
Allerton conference on communication, control and computing, 2003. 

[12] A.J. Ganesh, L. Massoulié and D. Towsley. The effect of network topology on the spread of epidemics, 
in Proc. IEEE Infocom, 2005. 

[13] C. Gkantsidis, M. Mihail and A. Saberi. Conductance and congestion in power law graphs, in 
Proceedings ACM SIGMETRICS, 2003. 

[14] P.W. Holland and S. Leinhardt. An exponential family of probability densities for directed graphs. 
Journal Amer. Stat. Assoc. 76, 33-51, 1981. 

[15] L. Lovász. Random Walks on Graphs: A Survey, in Combinatorics, Paul Erdős is Eighty, Vol. 2 (ed. 
D. Miklós, V. T. Sós, T. Szőnyi), János Bolyai Mathematical Society, 353-398, 1996. 

[16] D. McDonald. A local limit theorem for large deviations of sums of independent, non-identically 
distributed random variables. Annals of Probability 7, 526-531, 1979. 

[17] B. D. McKay and N.C. Wormald. Asymptotic enumeration by degree sequence of graphs of high 
degree, Europ. J. Combinatorics 11, 565-580, 1990. 






[18] B. Mohar. Some applications of Laplace eigenvalues of graphs, in Graph Symmetry, G. Hahn, G. 
Sabidussi (Eds.), Kluwer Academic Press, Dordrecht, 225-275, 1997. 

[19] M. E. J. Newman. The structure and function of complex networks, SIAM Review 45, 167-256, 
2003. 

[20] J. Park and M. E. J. Newman. The statistical mechanics of networks, Phys. Rev. E 70, 066117, 2004. 

[21] R. Pastor-Satorras and A. Vespignani. Evolution and structure of the Internet, Cambridge University 
Press, 2004. 

[22] T.A.B. Snijders. Markov chain Monte Carlo estimation of exponential random graph models, Journal 
of Social Structure 3(2), 2002. 

[23] R. Solomonoff and A. Rapoport. Connectivity of random nets, Bulletin of Mathematical Biophysics 
13, 107-117, 1951. 

[24] R. van der Hofstad. Random Graphs and Complex Networks, 
http://www.win.tue.nl/~rhofstad/NotesRGCN2008.pdf 

4 Appendix 

Let $D_1, \ldots, D_n$ be iid random variables with distribution given by (1101). Define 

$$f(j,\gamma) = \frac{1}{j!}\, e^{-\beta j^2 + \gamma j \log n}, \quad \text{and} \quad F(\gamma) = \sum_{j=0}^{\infty} f(j,\gamma), \tag{38}$$

so that $\mathbb{P}(D_1 = j) = f(j,\gamma)/F(\gamma)$. Now, the ratio 

$$\frac{f(j+1,\gamma)}{f(j,\gamma)} = \frac{1}{j+1}\, e^{-(2j+1)\beta + \gamma\log n}$$

is a decreasing function of $j$. Define $k_\gamma$ to be the smallest value of $j$ for which $f(j+1,\gamma)/f(j,\gamma) < 1$, and 
note that the maximum of $f(j,\gamma)$ over $j$ is attained at $k_\gamma$. Now, $k_\gamma - 1$ is the integer part of the (unique) 
solution $x_\gamma$ of the equation 

$$h(x,\gamma) := -\log(x+1) - (2x+1)\beta + \gamma\log n = 0. \tag{39}$$
It is readily verified that the solution is 

$$x_\gamma = \frac{1}{2\beta}\left(\gamma\log n - \log\log n + \log\frac{2\beta}{\gamma}\right) + O(1). \tag{40}$$
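
To make the definition of $k_\gamma$ concrete, here is a minimal Python sketch that solves equation (39) by bisection and checks that $\lfloor x_\gamma \rfloor + 1$ is indeed where $f(j,\gamma)$ peaks. The helper names and the values $\beta = 0.5$, $\gamma = 1$, $\log n = 20$ are illustrative assumptions; the quantity `leading` is the leading-order estimate $(\gamma\log n - \log\log n + \log(2\beta/\gamma))/(2\beta)$ obtained from (39), which should agree with the root only up to an additive constant.

```python
import math

def h(x, beta, gamma, log_n):
    """Left-hand side of (39); exp(h(j)) equals the ratio f(j+1)/f(j)."""
    return -math.log(x + 1.0) - (2.0 * x + 1.0) * beta + gamma * log_n

def solve_x(beta, gamma, log_n, iters=200):
    """Bisection for the root of h; assumes gamma * log_n > beta, so h(0) > 0."""
    lo, hi = 0.0, gamma * log_n / (2.0 * beta)  # h(hi) = -log(hi + 1) - beta < 0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if h(mid, beta, gamma, log_n) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def log_f(j, beta, gamma, log_n):
    """log f(j, gamma) from (38), with lgamma(j + 1) = log(j!)."""
    return -math.lgamma(j + 1) - beta * j * j + gamma * j * log_n

beta, gamma, log_n = 0.5, 1.0, 20.0   # hypothetical values for illustration
x = solve_x(beta, gamma, log_n)
k = math.floor(x) + 1                 # k_gamma
leading = (gamma * log_n - math.log(log_n)
           + math.log(2.0 * beta / gamma)) / (2.0 * beta)
print(x, k, leading)
```

Working with $\log f$ rather than $f$ avoids overflow, since $f(k_\gamma,\gamma)$ itself grows faster than any power of $n$.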

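Let $k_\gamma = \lfloor x_\gamma \rfloor + 1$. Then for any $j \ge 0$, 

$$\frac{f(k_\gamma+j+1,\gamma)}{f(k_\gamma+j,\gamma)} = \frac{1}{k_\gamma+j+1}\, e^{-\beta(2k_\gamma+2j+1)+\gamma\log n} = \frac{f(k_\gamma+1,\gamma)}{f(k_\gamma,\gamma)}\, \frac{k_\gamma+1}{k_\gamma+j+1}\, e^{-2\beta j} \le e^{-2\beta j},$$

where we have used the fact that $f(k_\gamma+1,\gamma)/f(k_\gamma,\gamma) < 1$ to obtain the last inequality. Iterating this 
inequality yields $f(k_\gamma+j,\gamma)/f(k_\gamma,\gamma) \le e^{-\beta j(j-1)}$. Similarly, we get 

$$\frac{f(k_\gamma-j-1,\gamma)}{f(k_\gamma-j,\gamma)} = \frac{f(k_\gamma-1,\gamma)}{f(k_\gamma,\gamma)} \left(1 - \frac{j}{k_\gamma}\right) e^{-2\beta j} \le e^{-2\beta j},$$

since $f(k_\gamma,\gamma)/f(k_\gamma-1,\gamma) \ge 1$ by the definition of $k_\gamma$. Iterating this inequality yields $f(k_\gamma-j,\gamma)/f(k_\gamma,\gamma) \le e^{-\beta j(j-1)}$. Thus, for all integers $j \ge -k_\gamma$, we have the inequality 

$$\frac{f(k_\gamma+j,\gamma)}{f(k_\gamma,\gamma)} \le e^{-\beta |j|(|j|-1)}. \tag{41}$$
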
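Next, we derive an equivalent for the above ratio. Observe that, for any fixed $j$, 

$$\frac{f(k_\gamma+j,\gamma)}{f(k_\gamma,\gamma)} = \frac{k_\gamma!}{(k_\gamma+j)!}\, e^{-\beta j(2k_\gamma+j)+\gamma j\log n} = \frac{1}{k_\gamma^{\,j}}\, e^{-\beta j(2k_\gamma+j)+\gamma j\log n} \left(1 + O\left(\frac{j^2}{k_\gamma}\right)\right).$$

Taking logarithms, 

$$\log\frac{f(k_\gamma+j,\gamma)}{f(k_\gamma,\gamma)} = -j\log k_\gamma - \beta j(2k_\gamma+j) + \gamma j\log n + O\left(\frac{1}{\log n}\right) = j\,h(x_\gamma,\gamma) + \alpha j - \beta j^2 + O\left(\frac{1}{\log n}\right),$$

where $\alpha = 2\beta\left(x_\gamma - k_\gamma + \frac{1}{2}\right)$. Note that $\alpha \in [-\beta, \beta]$ for all $n$ because $k_\gamma \in [x_\gamma, x_\gamma+1]$. Since $h(x_\gamma,\gamma) = 0$ 
by the definition of $x_\gamma$, we can now write 

$$g(j,\gamma) := \frac{f(k_\gamma+j,\gamma)}{f(k_\gamma,\gamma)} = (1+\Delta_j)\, e^{\alpha j - \beta j^2}, \quad \text{where } \Delta_j = O\left(\frac{1}{\log n}\right). \tag{42}$$

Thus, by (41), 

$$F(\gamma) = f(k_\gamma,\gamma) \sum_{j=-k_\gamma}^{\infty} g(j,\gamma) \sim K_0(\alpha,\beta)\, f(k_\gamma,\gamma), \tag{43}$$

where $K_0(\alpha,\beta) = \sum_{j=-\infty}^{\infty} e^{\alpha j - \beta j^2}$ is bounded uniformly in $\gamma$ and $n$. 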
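
The approximation (43) can be checked numerically. The sketch below, under illustrative assumptions ($\beta = 0.5$, $\gamma = 1$, $\log n = 1000$, hypothetical helper names), accumulates $g(j,\gamma)$ from the successive ratios $f(j+1,\gamma)/f(j,\gamma) = e^{h(j,\gamma)}$ and compares $F(\gamma)/f(k_\gamma,\gamma)$ with $K_0(\alpha,\beta)$, where $\alpha = 2\beta(x_\gamma - k_\gamma + 1/2)$. Both sums are truncated at 80 terms on each side, which is harmless here because the terms decay like $e^{-\beta j^2}$.

```python
import math

def h(x, beta, gamma, log_n):
    """exp(h(j)) is the ratio f(j+1)/f(j); h(x) = 0 defines x_gamma."""
    return -math.log(x + 1.0) - (2.0 * x + 1.0) * beta + gamma * log_n

def solve_x(beta, gamma, log_n, iters=200):
    """Bisection for the root of h; assumes gamma * log_n > beta."""
    lo, hi = 0.0, gamma * log_n / (2.0 * beta)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if h(mid, beta, gamma, log_n) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def ratio_sum(beta, gamma, log_n, k, width=80):
    """Truncated sum of g(j) = f(k + j)/f(k) over j >= -k, i.e. F/f(k)."""
    total, log_g = 1.0, 0.0                    # g(0) = 1
    for j in range(width):                     # g(j+1) = g(j) * exp(h(k+j))
        log_g += h(k + j, beta, gamma, log_n)
        total += math.exp(log_g)
    log_g = 0.0
    for j in range(min(width, k)):             # g(-j-1) = g(-j) * exp(-h(k-j-1))
        log_g -= h(k - j - 1, beta, gamma, log_n)
        total += math.exp(log_g)
    return total

def k0(alpha, beta, width=80):
    """Truncated version of K_0(alpha, beta)."""
    return sum(math.exp(alpha * j - beta * j * j)
               for j in range(-width, width + 1))

def compare(beta, gamma, log_n):
    x = solve_x(beta, gamma, log_n)
    k = math.floor(x) + 1
    alpha = 2.0 * beta * (x - k + 0.5)
    return ratio_sum(beta, gamma, log_n, k), k0(alpha, beta), alpha

if __name__ == "__main__":
    print(compare(0.5, 1.0, 1000.0))
```

The relative gap between the two quantities is of order $1/k_\gamma$, in line with the correction $\Delta_j$ in (42).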

4.1 Proof of Lemma 1 

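We obtain from (42) and (43) that 

$$\mathbb{E}D_1 = k_\gamma + \frac{\sum_{j=-k_\gamma}^{\infty} j\,(1+\Delta_j)\, e^{\alpha j-\beta j^2}}{\sum_{j=-k_\gamma}^{\infty} (1+\Delta_j)\, e^{\alpha j-\beta j^2}} \sim k_\gamma + K_1(\alpha,\beta), \tag{44}$$

where 

$$K_1(\alpha,\beta) = \left(\sum_{j=-\infty}^{\infty} j\, e^{\alpha j-\beta j^2}\right) \Big/ \left(\sum_{j=-\infty}^{\infty} e^{\alpha j-\beta j^2}\right).$$

Note that $K_1(\alpha,\beta)$ is bounded uniformly in $\gamma$ and $n$. It is also easy to see that $\mathbb{E}D_1$ is a continuous and 
increasing function of $\gamma$. This yields the first claim of the lemma. 
A similar calculation yields 

$$\mathbb{E}[D_1^2] \sim k_\gamma^2 + 2k_\gamma K_1(\alpha,\beta) + K_2(\alpha,\beta),$$

where 

$$K_2(\alpha,\beta) = \left(\sum_{j=-\infty}^{\infty} j^2\, e^{\alpha j-\beta j^2}\right) \Big/ \left(\sum_{j=-\infty}^{\infty} e^{\alpha j-\beta j^2}\right)$$

remains bounded, uniformly in $\gamma$ and $n$. Hence, 

$$\mathrm{Var}(D_1) \sim K_2(\alpha,\beta) - K_1(\alpha,\beta)^2$$

remains bounded. In fact, we see that $\mathrm{Var}(D_1)$ is asymptotic to the variance of a discrete Gaussian 
distribution; this distribution is non-degenerate for any finite $\beta$. Hence, $\mathrm{Var}(D_1)$ remains bounded below 
by some strictly positive constant as $n$ goes to infinity. 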

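Next, we evaluate the moment generating function of $D_1$. Proceeding as in the calculations of the 
mean and variance, we have 

$$\mathbb{E}\left[e^{\theta D_1}\right] = e^{\theta k_\gamma}\, \frac{\sum_{j=-k_\gamma}^{\infty} (1+\Delta_j)\, e^{(\theta+\alpha) j-\beta j^2}}{\sum_{j=-k_\gamma}^{\infty} (1+\Delta_j)\, e^{\alpha j-\beta j^2}} \sim e^{\theta k_\gamma}\, \psi(\theta), \tag{45}$$

where 

$$\psi(\theta) = \left(\sum_{j=-\infty}^{\infty} e^{(\theta+\alpha) j-\beta j^2}\right) \Big/ \left(\sum_{j=-\infty}^{\infty} e^{\alpha j-\beta j^2}\right) \tag{46}$$

is the moment generating function of the discrete Gaussian distribution which puts mass proportional to 
$e^{\alpha j-\beta j^2}$ at each $j \in \mathbb{Z}$. 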
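
The limiting quantities in this proof are all moments of a discrete Gaussian law with mass proportional to $e^{\alpha j - \beta j^2}$ on the integers. The following Python sketch computes its mean $K_1$, its variance $K_2 - K_1^2$ and its moment generating function by direct truncated summation. The function names and the truncation width of 60 are assumptions made for illustration; the truncation is adequate for moderate $\beta$ and small $\theta$, not for $\beta$ close to zero.

```python
import math

def discrete_gaussian(alpha, beta, width=60):
    """Support points and probabilities, mass proportional to exp(alpha j - beta j^2)."""
    js = list(range(-width, width + 1))
    weights = [math.exp(alpha * j - beta * j * j) for j in js]
    z = sum(weights)                      # truncated K_0(alpha, beta)
    return js, [w / z for w in weights]

def moments(alpha, beta):
    """Return (K_1, K_2 - K_1^2): mean and variance of the discrete Gaussian."""
    js, probs = discrete_gaussian(alpha, beta)
    k1 = sum(j * p for j, p in zip(js, probs))
    k2 = sum(j * j * p for j, p in zip(js, probs))
    return k1, k2 - k1 * k1

def psi(theta, alpha, beta):
    """Moment generating function of the discrete Gaussian at theta."""
    js, probs = discrete_gaussian(alpha, beta)
    return sum(math.exp(theta * j) * p for j, p in zip(js, probs))

if __name__ == "__main__":
    for alpha in (-0.5, 0.0, 0.5):        # alpha ranges over [-beta, beta]
        print(alpha, moments(alpha, 0.5), psi(0.3, alpha, 0.5))
```

For $\beta = 0.5$ the variance stays close to $1/(2\beta) = 1$ for every $\alpha$, which illustrates why $\mathrm{Var}(D_1)$ is bounded away from zero.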

4.2 Proof of Lemma 2 

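We obtain using (41) and (43) that, for $n$ large, 

$$\mathbb{P}(D_1 > n^{1/4}) = \frac{\sum_{j=n^{1/4}+1}^{\infty} f(j,\gamma)}{F(\gamma)} \sim \frac{1}{K_0(\alpha,\beta)} \sum_{j=n^{1/4}+1}^{\infty} \frac{f(j,\gamma)}{f(k_\gamma,\gamma)} \le \frac{e^{-\beta (n^{1/4}-k_\gamma)^2}}{K_0(\alpha,\beta)} \sum_{j=0}^{\infty} e^{-\beta j(j+1)} \le K e^{-\beta \sqrt{n}/4},$$

for some constant $K$. By the union bound, 

$$\mathbb{P}(D \in A^c) \le \sum_{i=1}^{n} \mathbb{P}(D_i > n^{1/4}) \le K n\, e^{-\beta \sqrt{n}/4},$$

which establishes the claim of the lemma. 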






4.3 Proof of Lemma 3 

In what follows we prove the result for the sequence Di. Following the same lines, one can prove the 
lemma for Di. 

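Since $\mathbb{E}D_1 = k_\gamma + K_1(\alpha,\beta)$, it follows from (45) that 

$$\mathbb{E}\left[e^{\theta X_{n1}}\right] = e^{-\theta \mathbb{E}D_1}\, \mathbb{E}\left[e^{\theta D_1}\right] \sim e^{-\theta K_1(\alpha,\beta)}\, \psi(\theta). \tag{47}$$

For fixed $\theta$, this is bounded uniformly in $n$ since $K_1(\alpha,\beta)$ is so bounded, and $\psi$ does not depend on $n$. 
The first claim of the lemma now follows from the inequality $\mathbb{E}[e^{\theta |X_{n1}|}] \le \mathbb{E}[e^{\theta X_{n1}}] + \mathbb{E}[e^{-\theta X_{n1}}]$. 

Since $X_{n1} = D_1 - \mathbb{E}D_1$, we have $\mathrm{Var}(X_{n1}) = \mathrm{Var}(D_1)$, and the second claim of the lemma is immediate 
from Lemma 1. 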

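The last claim of the lemma follows from the fact that 

$$\sum_{j=-\infty}^{\infty} \min\{\mathbb{P}(X_{n1} = j),\, \mathbb{P}(X_{n1} = j+1)\} = \sum_{j=0}^{\infty} \min\{\mathbb{P}(D_1 = j),\, \mathbb{P}(D_1 = j+1)\} \ge \sum_{j=0}^{\infty} \mathbb{P}(D_1 = j)\, \mathbb{P}(D_1 = j+1) > 0.$$

This completes the proof of the lemma. 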



4.4 Proof of Lemma 4 



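We shall bound $\mathbb{P}(D \in A_1(\alpha_1,\alpha_2)^c)$ using the moment generating function of $X_{n1} := D_1 - \mathbb{E}D_1$, and 
Chernoff's bound. Observe from (47) that 

$$\mathbb{E}\left[e^{\sqrt{\theta\log n}\, X_{n1}}\right] \sim e^{-\sqrt{\theta\log n}\, K_1(\alpha,\beta)}\, \psi\left(\sqrt{\theta\log n}\right), \tag{48}$$

where $\psi$ is defined in (46). Here, $\alpha$ and $\beta$ are constants, and $K_1(\alpha,\beta)$ remains bounded as $n \to \infty$. Let 

$$y^* = \frac{\sqrt{\theta\log n} + \alpha}{2\beta}, \qquad j^* = \lfloor y^* \rfloor.$$

We have 

$$\sum_{j=-\infty}^{\infty} e^{(\sqrt{\theta\log n}+\alpha) j - \beta j^2} = e^{(\sqrt{\theta\log n}+\alpha) j^* - \beta (j^*)^2} \sum_{j=-\infty}^{\infty} e^{(\sqrt{\theta\log n}+\alpha)(j-j^*) - \beta (j^2-(j^*)^2)} = e^{\beta j^*(2y^*-j^*)} \sum_{k=-\infty}^{\infty} e^{2\beta (y^*-j^*) k - \beta k^2},$$
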

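and so, 

$$\psi\left(\sqrt{\theta\log n}\right) = K(\alpha,\beta,\theta)\, e^{\beta j^*(2y^*-j^*)} \le K(\alpha,\beta,\theta)\, e^{\beta (y^*)^2},$$

where $K(\alpha,\beta,\theta)$ is bounded, uniformly in $n$ and $\theta$. Substituting this in (48) yields 

$$\mathbb{E}\left[e^{\sqrt{\theta\log n}\, X_{n1}}\right] \le \kappa_1 \exp\left(\frac{\theta\log n}{4\beta} + \kappa_2\sqrt{\theta\log n}\right), \tag{49}$$

where $\kappa_1$ and $\kappa_2$ may depend on $\alpha$, $\beta$, $\theta$ and $n$, but are bounded. Thus, we obtain using Chernoff's bound 
that 

$$\mathbb{P}\left(X_{n1} > \sqrt{\alpha_2\log n}\right) \le \kappa_1 \exp\left(-\sqrt{\theta\alpha_2}\,\log n + \frac{\theta\log n}{4\beta} + \kappa_2\sqrt{\theta\log n}\right),$$

for all $\theta > 0$. Take $\theta = 4\alpha_2\beta^2$. Now, by the union bound, 

$$\mathbb{P}\left(\bigcup_{j=1}^{n} \left\{X_{nj} > \sqrt{\alpha_2\log n}\right\}\right) \le \kappa_1 \exp\left(-(\alpha_2\beta - 1)\log n + 2\kappa_2\beta\sqrt{\alpha_2\log n}\right).$$

The constant $\alpha_2$ can be chosen large enough so that $\alpha_2\beta - 1 > K$. Hence the right hand side above 
decreases to zero faster than $e^{-K\log n}$ as $n \to \infty$. A similar bound can be obtained on the probability 
that $X_{nj} < -\sqrt{\alpha_1\log n}$ for some $j \in \{1, \ldots, n\}$. Thus, we have shown that, given $K > 0$, we can choose 
$\alpha_1$ and $\alpha_2$ so that 

$$\mathbb{P}\left(\bigcup_{j=1}^{n} \left\{X_{nj} > \sqrt{\alpha_2\log n}\right\} \cup \bigcup_{j=1}^{n} \left\{X_{nj} < -\sqrt{\alpha_1\log n}\right\}\right) < \frac{1}{2}\, e^{-K\log n} \tag{50}$$

for all $n$ sufficiently large. Here, $X_{nj} = D_j - \mathbb{E}D_j$, and the $D_j$ are iid with mean $c\log n$. Let $\bar{D}$ denote the 
empirical mean of $D_1, \ldots, D_n$. The event $|\bar{D} - \mathbb{E}D_1| > \sqrt{\eta\log n}$ is the same as the event $|X_{n1} + \cdots + X_{nn}| > n\sqrt{\eta\log n}$. Using the same Chernoff bound techniques as above, we can show that $\eta$ can be chosen so that, 
for sufficiently large $n$, this event has probability at most $e^{-K\log n}/2$. Combining this with (50) yields the 
claim of the lemma: simply take $\sqrt{\alpha_1'} = \sqrt{\alpha_1} + \sqrt{\eta}$ and $\sqrt{\alpha_2'} = \sqrt{\alpha_2} + \sqrt{\eta}$. 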
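
The choice $\theta = 4\alpha_2\beta^2$ in the Chernoff step can be sanity-checked numerically. The Python sketch below keeps only the coefficient of $\log n$ in the exponent, $-\sqrt{\theta\alpha_2} + \theta/(4\beta)$, and drops the lower-order $\kappa_2\sqrt{\theta\log n}$ term; the function names and the sample values $\alpha_2 = 3$, $\beta = 0.5$ are hypothetical and chosen only for illustration.

```python
import math

def chernoff_exponent(theta, alpha2, beta):
    """Coefficient of log n in the Chernoff exponent (lower-order terms dropped)."""
    return -math.sqrt(theta * alpha2) + theta / (4.0 * beta)

def optimal_theta(alpha2, beta):
    """Choice of theta used in the proof; it minimises chernoff_exponent."""
    return 4.0 * alpha2 * beta ** 2

def union_exponent(alpha2, beta):
    """Exponent after the union bound over n nodes: 1 - alpha2 * beta."""
    return 1.0 + chernoff_exponent(optimal_theta(alpha2, beta), alpha2, beta)

def alpha2_for_rate(rate, beta, margin=1.0):
    """Some alpha2 with alpha2 * beta - 1 > rate; margin is an arbitrary slack."""
    return (rate + 1.0 + margin) / beta

if __name__ == "__main__":
    alpha2, beta = 3.0, 0.5
    print(optimal_theta(alpha2, beta), union_exponent(alpha2, beta))
```

With these sample values the union-bound exponent is $1 - \alpha_2\beta = -0.5$, so the bound decays like $n^{-1/2}$; a larger $\alpha_2$ gives any prescribed polynomial rate.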
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